43 research outputs found

    The potential of text mining in data integration and network biology for plant research : a case study on Arabidopsis

    Despite the availability of various data repositories for plant research, a wealth of information currently remains hidden within the biomolecular literature. Text mining provides the necessary means to retrieve these data through automated processing of texts. However, only recently has advanced text mining methodology been implemented with sufficient computational power to process texts at a large scale. In this study, we assess the potential of large-scale text mining for plant biology research in general and for network biology in particular using a state-of-the-art text mining system applied to all PubMed abstracts and PubMed Central full texts. We present extensive evaluation of the textual data for Arabidopsis thaliana, assessing the overall accuracy of this new resource for usage in plant network analyses. Furthermore, we combine text mining information with both protein-protein and regulatory interactions from experimental databases. Clusters of tightly connected genes are delineated from the resulting network, illustrating how such an integrative approach is essential to grasp the current knowledge available for Arabidopsis and to uncover gene information through guilt by association. All large-scale data sets, as well as the manually curated textual data, are made publicly available, thereby stimulating the application of text mining data in future plant biology studies.

    Flowering of strict photoperiodic Nicotiana varieties in non-inductive conditions by transgenic approaches

    The genus Nicotiana contains species and varieties that respond differently to photoperiod for flowering time control as day-neutral, short-day and long-day plants. In classical photoperiodism studies, these varieties have been widely used to analyse the physiological nature of floral induction by day length. Since key regulators for flowering time control by day length have been identified in Arabidopsis thaliana by molecular genetic studies, it was intriguing to analyse how closely related plants in the Nicotiana genus with opposite photoperiodic requirements respond to certain flowering time regulators. SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 (SOC1) and FRUITFULL (FUL) are two MADS box genes that are involved in the regulation of flowering time in Arabidopsis. SOC1 is a central flowering time pathway integrator, whereas the exact role of FUL for floral induction has not been established yet. The putative Nicotiana orthologs of SOC1 and FUL, NtSOC1 and NtFUL, were studied in day-neutral tobacco Nicotiana tabacum cv Hicks, in short-day tobacco N. tabacum cv Hicks Maryland Mammoth (MM) and long-day N. sylvestris plants. Both genes were similarly expressed under short- and long-day conditions in day-neutral and short-day tobaccos, but showed a different expression pattern in N. sylvestris. Overexpression of NtSOC1 and NtFUL caused flowering either in strict short-day (NtSOC1) or long-day (NtFUL) Nicotiana varieties under non-inductive photoperiods, indicating that these genes might be limiting for floral induction under non-inductive conditions in different Nicotiana varieties.

    Nonrandom divergence of gene expression following gene and genome duplications in the flowering plant Arabidopsis thaliana

    BACKGROUND: Genome analyses have revealed that gene duplication in plants is rampant. Furthermore, many of the duplicated genes seem to have been created through ancient genome-wide duplication events. Recently, we have shown that gene loss is strikingly different for large- and small-scale duplication events and highly biased towards the functional class to which a gene belongs. Here, we study the expression divergence of genes that were created during large- and small-scale gene duplication events by means of microarray data and investigate both the influence of the origin (mode of duplication) and the function of the duplicated genes on expression divergence. RESULTS: Duplicates that have been created by large-scale duplication events and that can still be found in duplicated segments have expression patterns that are more correlated than those that were created by small-scale duplications or those that no longer lie in duplicated segments. Moreover, the former tend to have highly redundant or overlapping expression patterns and are mostly expressed in the same tissues, while the latter show asymmetric divergence. In addition, a strong bias in divergence of gene expression was observed towards gene function and the biological process genes are involved in. CONCLUSION: By using microarray expression data for Arabidopsis thaliana, we show that the mode of duplication, the function of the genes involved, and the time since duplication play important roles in the divergence of gene expression and, therefore, in the functional divergence of genes after duplication

    Predicting protein-protein interactions in Arabidopsis thaliana through integration of orthology, gene ontology and co-expression

    BACKGROUND: Large-scale identification of the interrelationships between different components of the cell, such as the interactions between proteins, has recently gained great interest. However, unraveling large-scale protein-protein interaction maps is laborious and expensive. Moreover, assessing the reliability of the interactions can be cumbersome. RESULTS: In this study, we have developed a computational method that exploits the existing knowledge on protein-protein interactions in diverse species through orthologous relations on the one hand, and functional association data on the other hand to predict and filter protein-protein interactions in Arabidopsis thaliana. A highly reliable set of protein-protein interactions is predicted through this integrative approach making use of existing protein-protein interaction data from yeast, human, C. elegans and D. melanogaster. Localization, biological process, and co-expression data are used as powerful indicators for protein-protein interactions. The functional repertoire of the identified interactome reveals interactions between proteins functioning in well-conserved as well as plant-specific biological processes. We observe that although common mechanisms (e.g. actin polymerization) and components (e.g. ARPs, actin-related proteins) exist between different lineages, they are active in specific processes such as growth, cancer metastasis and trichome development in yeast, human and Arabidopsis, respectively. CONCLUSION: We conclude that the integration of orthology with functional association data is adequate to predict protein-protein interactions. Through this approach, a high number of novel protein-protein interactions with diverse biological roles is discovered. Overall, we have predicted a reliable set of protein-protein interactions suitable for further computational as well as experimental analyses.

    The gain and loss of genes during 600 million years of vertebrate evolution

    BACKGROUND: Gene duplication is assumed to have played a crucial role in the evolution of vertebrate organisms. Apart from a continuous mode of duplication, two or three whole genome duplication events have been proposed during the evolution of vertebrates, one or two at the dawn of vertebrate evolution, and an additional one in the fish lineage, not shared with land vertebrates. Here, we have studied gene gain and loss in seven different vertebrate genomes, spanning an evolutionary period of about 600 million years. RESULTS: We show that: first, the majority of duplicated genes in extant vertebrate genomes are ancient and were created at times that coincide with proposed whole genome duplication events; second, there exist significant differences in gene retention for different functional categories of genes between fishes and land vertebrates; third, there seems to be a considerable bias in gene retention of regulatory genes towards the mode of gene duplication (whole genome duplication events compared to smaller-scale events), which is in accordance with the so-called gene balance hypothesis; and fourth, that ancient duplicates that have survived for many hundreds of millions of years can still be lost. CONCLUSION: Based on phylogenetic analyses, we show that both the mode of duplication and the functional class the duplicated genes belong to have been of major importance for the evolution of the vertebrates. In particular, we provide evidence that massive gene duplication (probably as a consequence of entire genome duplications) at the dawn of vertebrate evolution might have been particularly important for the evolution of complex vertebrates

    A guide to CORNET for the construction of coexpression and protein-protein interaction networks

    To enable easy access and interpretation of heterogenous and scattered data, we have developed a user-friendly tool for data mining and integration in Arabidopsis thaliana, designated CORrelation NETworks (acronym CORNET), allowing browsing of microarray data, construction of coexpression and protein–protein interactions (PPIs), analysis of gene association and transcription factor (TF) regulatory networks, and exploration of diverse functional annotations. CORNET consists of three tools that can be used individually or in combination, namely, the coexpression tool, the PPI tool, and the TF tool. Different search options are implemented to enable the creation of networks centered around multiple input genes or proteins. Functional annotation resources are included to retrieve relevant literature, phenotypes, localization, gene ontology, plant ontology, and biological pathways. Networks and associated evidence of the majority of the currently available data types are visualized in Cytoscape. CORNET is available at https://bioinformatics.psb.ugent.be/cornet

    Gibberellins and DELLAs: central nodes in growth regulatory networks

    Gibberellins (GAs) are growth-promoting phytohormones that were crucial in breeding improved semi-dwarf varieties during the green revolution. However, the molecular basis for GA-induced growth stimulation is poorly understood. In this review, we use light-regulated hypocotyl elongation as a case study, combined with a meta-analysis of available transcriptome data, to discuss the role of GAs as central nodes in networks connecting environmental inputs to growth. These networks are highly tissue-specific, with dynamic and rapid regulation that mostly occurs at the protein level, directly affecting the activity and transcription of effectors. New systems biology approaches addressing the role of GAs in growth should take these properties into account, combining tissue-specific interactomics, transcriptomics and modeling, to provide essential knowledge to fuel a second green revolution